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Abstract 

During the past two decades there has been a lot of interest in developing statis¬ 
tical depth notions that generalize the univariate concept of ranking to multivariate 
data. The notion of depth has also been extended to regression models and func¬ 
tional data. However, computing such depth functions as well as their contours and 
deepest points is not trivial. Techniques of computational geometry appear to be 
well-suited for the development of such algorithms. Both the statistical and the 
computational geometry communities have done much work in this direction, often 
in close collaboration. We give a short review of this work, focusing mainly on depth 
and multivariate medians, and end by listing some other areas of statistics where 
computational geometry has been of great help in constructing efficient algorithms. 


1 Depth notions 


A data set cousistiug of n uuivariate poiuts is usually rauked iu asceudiug or 
desceudiug order. Uuivariate order statistics (i.e., the ‘fcth suiallest value out of 
n’) aud derived quautities have beeu studied exteusively. The uiediau is dehued 
as the order statistic of rauk (n + l)/2 wheu n is odd, aud as the average of 
the order statistics of rauks n/2 aud (n + 2)/2 wheu n is eveu. The uiediau aud 
auy other order statistic of a uuivariate data set cau be couiputed iu 0{n) tirue. 
Geueralizatiou to higher dirueusious is, however, uot straightforward. 

Alteruatively, uuivariate poiuts ruay be rauked from the outside iuward by as- 
siguiug the most extreme data poiuts depth 1, the secoud smallest aud secoud 
largest data poiuts depth 2, etc. The deepest poiut theu equals the usual mediau 
of the sample. The advautage of this type of raukiug is that it cau be exteuded to 
higher dimeusious more easily. This sectiou gives au overview of several possible 
geueralizatious of depth aud the mediau to multivariate settiugs. Survey s of sta ¬ 
tisti cal appl icatious of multivariate data depth may be fouud iu LPS99( |. |ZS0Cll |. 
aud |Mos13I | . 
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1.1 Halfspace location depth 


Let Xn = {xi ,..., Xn} be a fin ite set of data points in The Tukey deyth or 
halfspace depth (introduced by jTuk75 | and further developed by jDG92|) of any 
point 6 in (not necessarily a data point) determines how central the point is 
inside the data cloud. The halfspace depth of 6 is defined as the minimal number 
of data points in any closed halfspace determined by a hyperplane through 6: 


hdepth[6] Xn) = min ff{i]U^Xi^ u^O}. 

||tt||=i 

Thus, a point lying outside the convex hull of Xn has depth 0, and any data point 
has depth at least 1. Figure [T] illustrates this definition for d = 2. 
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Figure 1: Illustration of the bivariate halfspace depth. Here 6 (which is not a data point 
itself) has depth 1 because the halfspace determined by u contains only one data point. 

The set of all points with depth ^ k is called the kth depth region Dk- The 
halfspace depth regions form a sequence of nested polyhedra. Each Dh is the 
intersection of all halfspaces containing at least n — k + 1 data points. Moreover, 
every data point must be a vertex of one or more depth regions. The point with 
maximal halfspace depth is called the Tukey median. When the innermost depth 
region is larger than a singleton, the Tukey median is defined as its centroid. This 
makes the Tukey median unique by construction. 
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Note that the depth regions give an indication of the s hape of the data clond. 
Based on this idea one can constrnct the bagplot |RRT99 |. a bivariate version of 
the nnivariate boxplot. Figure [2] shows such a bagplot. The cross in the white disk 
is the Tukey median. The dark area is an interpolation between two subsequent 
depth regions, and contains 50% of the data. This area (the “bag”) gives an idea 
of the shape of the majority of the data cloud. Inflating the bag by a factor of 
3 relative to the Tukey median yields the “fence” (not shown), and data points 
outside the fence are called outliers and marked by stars. Finally, the light gray 
area is the convex hull of the non-outlying data points. 


Bagplot 



Figure 2: Bagplot of the heart and spleen size of 73 hamsters. 


More generally, in the multivariate case one can define the bagdistance jHRSlhbl 
of a point x relative to the Tukey median and the bag. Assume that the Tukey 
median lies in the interior of the bag, not on its boundary (this excludes degen¬ 
erate cases). Then the bagdistance is the smallest real number A such that the 
bag inflated (or deflated) by A around the Tukey median contains the point x. 
When the Tukey median equals 0, it is shown in |HRS15b | that the bagdistance 
satishes all axioms of a norm except that ||aa:|| = |a| I |cc| | only nee ds to hold when 


|aa:;|| = |a|||cc| 
The bagdista nce is used for outlier detection 


a ^ 0. 
classification iHRSlSb 


HRS 15a I and statistical 


An often used criterion to judge the robustness of an estimator is its breakdown 
value. The breakdown value is the smallest fraction of data points that we need 
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to replace in order to move the estimator of the contaminated data set arbitrarily 
far away. The classical mean of a data set has breakdown value zero since we can 
move it anywhere by moving one observation. Note that for any estimator which 
is equivariant for translation (which is required to call it a location estimator) the 
breakdown value can be at most 1/2. (If we replace half of the points by a far-away 
translation image of the remaining half, the estimator cannot distinguish which 
were the original data.) 

The Tukey depth and the corresponding median have good statistical proper¬ 
ties. The Tukey median T* is a location estimator with breakdown value en{T*-, X„) 
^ l/((i -|- 1) for any data set in general position. This means that it remains in 
a predetermined bounded region unless n/{d + 1) or more data points are moved. 
At an elliptically symmetric distribution the breakdown value becomes 1/3 for 
large n, irrespective of d. Moreover, the halfspace depth is invariant under all 
nonsingular affine transformations of the data, making the Tukey median affine 
equivariant. Since data transformations such as rotation and rescaling are very 
common in statistics, this is an important prope rty. The statistical asymptotics 
of the Tukey median have been studied in |BH99| . 

The need for fast algorithms for the halfspace depth has only grown over the 
years, since it is curr ently be ing applied to a variety of settings such as nonpara- 
metric classification jLCL12| . A related development is the fast growing field of 
functional data analysis, where the data are functions on a univariate interval (e.g. 
time or wavelength) or on a rectangle (e.g. surfaces, images). Often the function 
values are themselves multivariate. One can then d ehne th e depth of a curve (sur¬ 
face) by integrating the depth over all points as in |Clal4|. This functional d epth 
can again be used for outlier detection and classification [HRS 1 Sal. HRSlSbj ]. but 
it requires computing depths in many multivariate data sets instead of just one. 

Remark: centerpoints. There is a close relationship between the Tukey 
depth and centerpoints, which have been long studied in computational geometry. 
In fact, Tukey depth extends the notion of centerpoint. A centerpoint is any point 
with halfspace depth ^ \n/ (d-l-l)]. A consequence of Helly’s theorem is that there 
always exists at least one centerpoint, so the depth of the Tukey median cannot 
be less than \n/[d + 1)]. 


1.2 Other location depth notions 

1. Simplicial depth ( (Liu90 |). The depth of 6 equals the number of simplices 


formed by d -|- 1 data points that contain 6. Formally, 


sdepth{9; X„) = #{(A,..., id+i)] 0 e S'), 


X 


* 1 ) 


X 


^d+1 


]}• 


The simplicial median is affine equivariant with a breakdown value bounded 
above by l/(d -|- 2). Unlike halfspace depth, its depth regions need not be 
convex. 
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2. Oja depth ( |Oia83| ). This is also called simplicial volume depth: 


odepth{6-, Xn) = (l + {volume S[0,Xi^,... ,Xi^]}) 


The corresponding median is also affine equivariant, but has zero breakdown 
value. 

3. Projection depth. We first define the outlyingness ( |DG92j ) of any point 6 
relative to the data set Xn as 

lu'^O ~ medidnAu^Xi 
Xn) = max- ^ ^^ ^- 

i|u||=i 



where the median absolute deviation (MAD) of a univariate data set 
{r/i,..., is the statistic MADj{|/j} = medianj ||/j — medianj{|/j}|. The 
outlyingness is small for centrally located points and increases if we move to¬ 
ward the boundary of the data cloud. Instead of the median and the MAD, 
also another pair (T, S) of a location and scatter estimate may be chosen. 
This leads to different notions of projection depth, all defined as 


pdepth{6-, Xn) = (1 + 0(0; Xn)) 


-1 


General projection depth is studied in |Zuo03 |. When using the median 


and the MAD, the projection depth has breakdown value 1/2 and is affine 
equivariant. Its depth regions are convex. 


4. Spatial depth_(_[Ser02(]). Spatial depth is related to multivariate quantiles 


proposed in |Cha96 


spdepth{9; Xn) = 1 — 


n 


E 

2=1 


Xi 


0 


Xi 


e\ 


The spatial median is also called the median f |Gow74 |). It has breakdown 
value 1/2, but is not affine equivariant (it is only equivariant with respect 
to translations, multiplication by a scalar factor, and orthogonal transfor¬ 
mations!. For a recent survey on the computation of the spatial median 
FFG12L 


see 


A comparison of the main properties of the different location depth medians is 
given in Table [U 
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Table 1: Comparison of several location depth medians 


MEDIAN 

BREAKDOWN VALUE 

AFFINE EQUIVARIANCE 

Tukey 

worst-case l/{d + 1) 
typically 1/3 

yes 

Simplicial 

^ l/(d -|- 2) 

yes 

Oja 

2/n Ri 0 

yes 

Projection 

1/2 

yes 

Spatial 

1/2 

no 


1.3 Arrangement and regression depth 

Following jR.H99bj | we now define the depth of a point relative to an arrangement 
of hyperplanes. A point 6 is said to have zero arrangement depth if there ex¬ 
ists a ray {0 + An; A ^ 0} that does not cross any of the hyperplanes hi in the 
arrangement. (A hyperplane parallel to the ray is counted as intersecting at in¬ 
finity.) The arrangement depth of any point 0 is defined as the minimum number 
of hyperplanes intersected by any ray from 0. Figure E] shows an arrangement of 
lines. In this plot, the points 0 and r] have arrangement depth 0 and the point 
^ has arrangement depth 2. The arra ngement depth is always constant on open 
cells and on cell edges. It was shown f |RH99bl |l that any arrangement of lines in 
the plane encloses a point with arrangement depth at least [n/3], giving rise to a 
new type of “centerpoints.” 

This notion of depth was originally defined 1 jRH99 |l in the dual, as the depth 
of a regression hyperplane He relative to a point configuration of the form = 
{(a;i,?/i),..., {Xn,yn)} in Regression depth ranks hyperplanes according to 

how well they fit the data in a regression model, with x containing the predictor 
variables and y the response. A vertical hyperplane (given by a^x = constant), 
which cannot be used to predict future response values, is called a “nonfit” and 
assigned regression depth 0. The regression depth of a general hyperplane He is 
found by rotating He in a continuous movement until it becomes vertical. The 
minimum number of data points that is passed in such a rotation is called the 
regression depth of He- Figure 0] is the dual representation of Figure [3l (For 
instance, the line 6 has slope 6 *i and intercept 6*2 and corresponds to the point 
( 6 * 1 , 6 * 2 ) in Figure El) The lines 6 and 77 have regression depth 0 , whereas the line 
^ has regression depth 2 . 

In statistics one is interested in the deepest fit or regression depth median, 
because this is a line (hyperplane) about which the data are well-balanced. The 
statistical properties of regression depth and the deepest fit are very similar to 
those of the Tukey depth and median. The bounds on the maximal depth are 
almost the same. Moreover, for both depth notions the value of the rn aximal 
depth can be used to characterize the symmetry of the distribution f jR,Sn4 |h The 
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Figure 3: Example of arrangement depth. In this arrangement of lines, the points 6 and 
r] have arrangement depth 0, whereas ^ has arrangement depth 2. (See Figure 0] for the 
dual plot.) 


breakdown value of the deepest fit is at least l/((i+ 1) and under linearity of the 
conditional median of y given x it converges to 1/3. In the next section, we will 
see that the optimal complexities for computing the depth and the median are 
also comparable to those for halfspace depth. F or a det ailed comparison of the 


properties of halfspace and regression depth, see [HRVOl 


The arrangement depth region is dehned in the primal, as the set of points 
with arrangement depth at least k. Contrary to the Tukey depth, these depth 
regions need not be convex. But nevertheless it was proved tha t there always exists 
a point with arrang ement d epth at least \n/{d+l)'\ 1 jABE'*~nnl |h An analysis-based 


Mizn2l . 


proof was given in 

Remark: arrangement levels. Arrangement depth is undirected (isotropic) 
in the sense that it is dehned as a minimum over all possible directions. If we re¬ 
strict ourselves to vertical directions u (i.e., up or down), we obtain the usual levels 
of the arrangement known in combinatorial geometry. The absence of preferential 
directions makes arrangement depth invariant under affine transformations. 
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Figure 4: Example of the regression depth of a line in a bivariate configuration of points. 
The lines 9 and rj have regression depth 0, whereas the line ^ has regression depth 2. 
(This is the dual of Figure [3l) 

2 Computing depth 

Although the definitions of depth are intuitive, the computational aspects can be 
quite challenging. The calculation of depth regions and medians is computation¬ 
ally intensive, especially for large data sets in higher dimensions. In statistical 
practice such data are quite common, and therefore reliable and efficient algo¬ 
rithms are needed. For the bivariate case several algorithms have been developed 
early on. The computational aspects of depth in higher dimensions are currently 
being explored. 

Algorithms for depth-related measures are often more complex for data sets 
which are not in general position than for data sets in general position. For exam¬ 
ple, the boundaries of subsequent halfspace depth regions are always disjoint when 
the data are in general position, but this does not hold for nongeneral position. 
Preferably, algorithms should be able to handle both the general position case and 
the nongeneral position case directly. As a quick fix, algorithms which were made 





for general position can also be applied in the other case if one hrst adds small 
random errors to the data points. For large data sets, this ‘dithering’ will have a 
limited effect on the results. 


2.1 Bivariate algorithms 


Table [2] gives an overview of algorithms, each of which has been implemented, to 
compute the depth in a given point 0 in R^. These algorithms are time-optimal, 
since th e problern of comp uting these bivariate depths has an f2(nlogn) lower 
bound r |ACG+n2^ . iLsnob^ h 


The algorithms for halfspace and simplicial depth are based on the same tech¬ 
nique. First, data points are radially sorted around 6. Then a line through 0 
is rotated. The depth is calculated by counting the number of points that are 
passed by the rotating line in a specihc manner. The planar arrangement depth 
algorithm is easiest to visualize in the regression setting. To compute the depth 
of a hyperplane He with coefficients 6, the data are hrst sorted along the x-axis. 
A vertical line L is then moved from left to right and each time a data point is 
passed, the number of points above and below He on both sides of L is updated. 


Table 2: Computing the depth of a bivariate point. 


DEPTH 

TIME COMPLEXITY 

SOURCE 

Tukey depth 

0(n log n) 

lR,R9fil 

Simplicial depth 

0(n log n) 

lR,R9fil 

Arrangement/regression depth 

0(n log n) 

iRH99l 


In general, computing a median is harder than computing the depth in a point, 
because typically there are many candidate points. For instance, for the bivariate 
simplicial median the currently best algorithm requires 0{n^) time, whereas its 
corresponding depth needs only 0{n\ogn). The simplicial median seems difficult 
to compute because there are 0{n‘^) candidate points (namely, all intersections 
of lines passing through two data points) and the simplicial depth regions have 
irregular shapes, but of course a faster algorithm may yet be found. 

Fortunately, in several important cases the median can be computed without 
computing the depth of individual points . A li near-time algorithm to compute 


a bivariate centerpoint was described in jJM94| . Table [3] gives an overview of 
algorithms to compute bivariate depth-based me dians. For the bivariate Tukey 


median the lower bound Gfnlogn) was p roved in jLS00| . and the currently fastest 
algorithm takes 0(?7,log^?7,) time f |LSn3 |h The lower bound Gfnlo gn) als o holds 
for the median of arrange ment (r egress ion) depth as shown by |LSn3bl |. Fast 


algorithms were devised by |LSn3b( | and IVMR+nS 
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Table 3: Computing the bivariate median. 


MEDIAN 

TIME COMPLEXITY 

SOURCE 

Tukey median 

Simplicial median 

Oja median 

Regression depth median 

0(n log^ re) 

0{n^) 

0(relog^ re) 
O(relogre) 

[LSn3] 

iALS+031 

lALS+n3l 

TS03bl 


The computation of bi variate halfspace depth regions has also been stud¬ 
ied. The hrst algorithm |R,R,96bl | required 0{'n?\ogn) time per depth region. 
An algorith m to coin pute all regions in 0{'n?) time is constructed and imple¬ 
mented in |MRR'*'03I |. This algorithm thus also yields the Tukey median. It 
is based on the dual arrangement of lines where to pological sweep is applied. 
A completely different approach is implemented in |KMV02| . They make di¬ 
rect use of the graphics hardware to approximate the depth regions of a set of 
points in 0{nW + W^) + nCW'^Ib l2 time , where the pixel grid is of dimension 


{2W -1- 1) X {2W -I- 1). Recently, jBR.Sll | constructed an algorithm to update 


halfspace depth and its regions when points are added to the data set. 


2.2 Algorithms in higher dimensions 

The first algorithms to compute the ha lfspace and regression depth of a given point 


m 


^ with d > 2 were constructed in R,S98| | and require 0{n'^ ^ logn) time. The 


main idea was to use projections onto a lower-dimensional space. This reduces the 
problem to computing biv ariate dep ths, for which the existing algorithms have op¬ 
timal time complexity. In jBCI-l-08| theoretical output-sensitive algorithms for the 


halfspace depth are proposed. An interesting computationa l conne ction between 
halfspace depth and rn ultivariate quantiles was provided in [hRSTo] and IkM12 


LMM1 


More recently, |DM14j | provided a generalized version of the algorithm of |R,S98 
together with C-I--I- code. For the depth regions of halfsp ace depth in higher di¬ 
mensions an algorithm was recently proposed in 
For the computation of projection depth see 
point in can be computed in time, and in 

0{n‘^) time |COOll . For higher dimensions, no better algorithm is known than the 


LZ14 


The simplicial depth of a 
^ the fastest algorithm needs 


straightforward method to compute all simplices. 

When the number of data points and dimensions are such that the above algo¬ 
rithms become infeasible, one can resort to approximat e algor ithms . For half space 


depth such approximate algorithms were proposed in jR,S98[ | and |CMW 13[| . An 


ap proximatio n to the Tukey median using steepest descent can be found in |SR,nnl | . 
In [VRH+n2 an algorithm is described to approximate the deepest regression fit 
in any dimension. 
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3 Some other statistical techniques benefitting from com¬ 
putational geometry 


Computational geometry has provided fast and reliable algorithms for many other 
statistical techniques. 

Linear regression is a frequently used statistical technique. The ordinary least 
squares regression, minimizing the sum of squares of the residuals, is easy to cal¬ 
culate, but produces unreliable results whenever one or more outliers are present 
in the data. Robust alternatives are often computationally intensive. We here 
give some examples of regression methods for which geometric or combinatorial 
algorithms have been constructed. 


1. regression. This well-known alternative to least squares regression mini¬ 
mizes the sum of the absolute values of the residuals, and is robu st to ver 


2 . 


3. 


4. 


tical outlier s. Algorithms for regression may be found in, e.g., YKI'*~88 


and PK97 


Least median of squares (LMS) regression i |Rou ^). This method minimizes 
the median of the squared residuals and has a breakdown value of 1/2. To 
compute the bivariat e LMS line, an O(n^) algorithm using topological sweep 


has been developed lEShOll. An approximation algor it hr n for t he LMS line 
was constructed in |MN^97 |. The recent algorithm of |BM14 | uses mixed 
integer optimization. 


Median slope regression i |The5n |. Sen68]). This bivariate regression tech¬ 
nique estimates the slope as the median of the slopes of all lines through 
tw o data points. An algorithm with optimal complexity Olnlogn ) is given 
in |BC98| . and a more practical randomized algorithm in |DMN92j | . 


Repeated median regression f |Sie82 |i. Median slope regression takes the me¬ 
dian over all couples (d-tuples in general) of data points. Here, this median is 
repla ced by d nested medians. For the bivariate repeated median regression 
line, |MMN98| provide an efficient randomized algorithm. 


The aim of cluster analysis is to divide a data set into clusters of similar ob¬ 
jects. Partitioning methods divide the data into k groups. Hierarchical methods 
construct a complete clustering tree, such that each cut of the tree gives a partition 
of the data se t. A sele ction of clustering methods with accompanying algorithms is 
presented in jSHR97j |. The general problem of partitioning a data set into groups 
such that the partition minimizes a given error function / is NP-hard. However, 
for some special cases efficient algorithms exist. For a small number of clusters 
in low dimensions, exact algorithms for partitioning methods can be constructed. 
Constru c ting clu stering trees is also closely related to geometric problems (see e.g.. 


Fpp97 |. Fpp98| ). 


11 








































4 Other surveys 

All results not given an explicit reference above may be traced in these surveys. 


Mosl3l |: A survey of multivariate data depth and its statistical applications. 


ShaTGj: An overview of the computational complexities of basic statistics problems 


like ranking, regression, and classihcation. 

Sma9fll |: An overview of several multivariate medians and their basic properties. 


ZSO^: A classihcation of multivariate data depths based on their statistical prop¬ 


erties. 
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